In several recent publications, Bettencourt, West and collaborators
claim that properties of cities such as gross economic production, 
personal income, numbers of patents filed, number of crimes com- 
mitted, etc., show super-linear power-scaling with total population, 
while measures of resource use show sub-linear power-law scaling. 
Re-analysis of the gross economic production and personal income 
for cities in the United States, however, shows that the data cannot 
distinguish between power laws and other functional forms, includ- 
ing logarithmic growth, and that size predicts relatively little of the 
variation between cities. The striking appearance of scaling in previous
work is largely an artifact of using extensive quantities (city-wide
totals) rather than intensive ones (per-capita rates). The remaining
dependence of productivity on city size is explained by concentration 
of specialist service industries, with high value-added per worker, in 
larger cities, in accordance with the long-standing economic notion 
of the "hierarchy of central places". 

Applied statistics | Model comparison | Urban economics | Urban scaling 
Central place hierarchy | Non-parametric smoothing 

Abbreviations: MSA, metropolitan statistical area; GMP, gross metropolitan product;
BEA, Bureau of Economic Analysis; RMS, root mean square 

Introduction 

Recent dramatic advances in explaining metabolic scaling 
relations in biology by the properties of optimal transport 
networks [1, 2] suggest the possibility of examining social assemblages,
especially cities, in similar terms. In a well-known
series of papers, Bettencourt, West and collaborators 0, 0] 
claim that many social and economic properties of cities — 
gross economic production, total personal income, number of 
patents filed, number of people employed in "supercreative" 
[a| occupations, number of crimes committed, etc. — grow as 
super-linear powers of population size, while measures of to- 
tal resource use grow as sub-linear powers. These two claims 
imply that per capita output grows as a positive power of 
population, while per capita resource use shrinks as a nega- 
tive power. If reliable and precise scaling laws of this type 
exist, they would be of considerable importance for both sci- 
ence and policy QQ. 

Reasonable arguments from long-standing principles of 
economic geography would lead one to expect that larger cities 
would have higher economic output per capita, through a com- 
bination of the benefits to firms in related industries cluster- 
ing together ("agglomeration economies"), and the tendency 
of firms and specialists with large increasing returns to scale 
to be located high in the "hierarchy of central places". (For 
reviews of these concepts, including historical notes, mathe- 
matical models and empirical evidence, see Refs. 0, Si-) 
These arguments would carry over to producing technologi- 
cally useful knowledge and to "supercreative" services as well. 
However, these economic considerations do not point to either 
a particular functional form for the growth of per-capita out- 
put with population, or suggest that it should be very strong. 
Moreover, these theories do not look at individual cities as 
isolated monads, as scaling arguments do, but rather rely on 
there being assemblages of multiple cities (and rural areas), 
coupled by common economic processes, and assuming distinct
roles in those processes through a history of mutual interaction
and combined and uneven development.

The purpose of this note is to argue that, at least for 
the United States, while there is indeed a tendency for per- 
capita economic output to rise with population, power-law 
scaling predicts the data no better than many other func- 
tional forms, and worse than some others. Furthermore, the 
impressive appearance of scaling displayed in Refs. 0, 13 is 
largely an aggregation artifact, arising from looking at exten- 
sive (city-wide) variables rather than intensive (per-capita) 
ones. The actual ability of city size to predict economic out- 
put, no matter what functional form is used, is quite modest. 
These conclusions hold whether economic output is measured 
by gross metropolitan product or by total personal income. 
If we control for metropolitan areas' varying concentration 
of industrial sectors, we find that the remaining scaling with 
population is negligible, and much of the variance across cities 
is predicted by the extent to which they host specialist service 
providers with strongly increasing returns, as predicted by the 
idea of the hierarchy of central places. 

I begin by re-analyzing the gross metropolitan product 
data, showing that scaling is far weaker than it seemed in 
Refs. [3, 3|. I then re-analyze the data on walking speed,
originating in Ref. [10] and presented in Ref. [3], which makes
the problems with the scaling analysis very clear. Per-capita 
productivity is better predicted by how much a city depends 
on industrial sectors which indicate a high position in the hier- 
archy of specialist service provision. This actually eliminates 
any significant role for scaling with size. The conclusions sum- 
marize the scientific import of the data analyses. 

The supplemental information shows that (i) scaling also 
fails for personal income, (ii) the hypothesis of power-law scal- 
ing cannot be saved by positing a mixture of distinct scaling 
relations, and that (iii) contra Ref. jj] , neither a Gaussian nor 
a Laplace distribution is a good fit to the deviations from the 
power-law scaling relations. 

All calculations were done using R [ij, version 2.12. Code 
for reproducing figures and analyses is included in the supple- 
mental information. 
Bettencourt and West summarize their claims regarding this "unified theory of urban living" [6]
thus: "We have recently shown that these general trends [to cities] can be expressed as simple
mathematical laws"; "Our work shows that, despite appearances, cities are approximately scaled
versions of one another . . . : New York and Tokyo are, to a surprising and predictable degree,
nonlinearly scaled-up versions of San Francisco in California or Nagoya in Japan. These extraordinary
regularities open a window on underlying mechanism, dynamics and structure common to all cities";
"Surprisingly, size is the major determinant of most characteristics of a city; history, geography and
design have secondary roles".
Models

Scaling Models. Ref. [3] reported a power-law scaling between
the population of cities in the United States and
their economic output. To be precise, the units of analysis
are "metropolitan statistical areas" (MSAs) as defined
by the official statistical agencies. The measure of
economic output is the gross domestic product for each 
metropolitan area ("gross metropolitan product" or GMP), 
as calculated by the U.S. Bureau of Economic Analysis 
(http://www.bea.gov/regional/gdpmetro/), which is sup- 
posed to be the sum of all "incomes earned by labor and 
capital and the costs incurred in the production of goods and 
services" in the metropolitan area [T^PI. Ref. [3| analyzed 
data for 2006, deflated to constant 2001 dollars, and I will do 
likewise; the 2008 and 2004 data are not much different. 

Ref. [3] proposes that output scales as a power of population,
$Y \propto N^{b}$. This is connected to the data via the linear
regression model

$$\ln Y = \ln c + b \ln N + \epsilon, \tag{1}$$

with $\epsilon$ being a mean-zero noise term. For later comparisons,
it will be convenient to denote this by $Y \sim cN^{b}$.
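
As an illustration of how Eq. [1] can be fit, the following is a minimal sketch in Python (the analyses in this paper were done in R). The function name `fit_power_law`, the synthetic populations and the "true" parameter values are assumptions made purely for illustration; they are not the BEA data, nor the code from the supplemental information.

```python
import math
import random


def fit_power_law(populations, outputs):
    """Ordinary least squares of ln Y on ln N.

    Returns (c, b) for the relation Y ~ c * N**b of Eq. [1].
    """
    xs = [math.log(n) for n in populations]
    ys = [math.log(y) for y in outputs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope = scaling exponent
    ln_c = y_bar - b * x_bar      # intercept = log of the pre-factor
    return math.exp(ln_c), b


# Synthetic stand-in data (hypothetical): 366 "cities" with a true
# exponent of 1.12 and log-scale noise of standard deviation 0.23.
rng = random.Random(0)
populations = [math.exp(rng.uniform(11.0, 16.0)) for _ in range(366)]
outputs = [20.0 * n ** 1.12 * math.exp(rng.gauss(0.0, 0.23))
           for n in populations]

c_hat, b_hat = fit_power_law(populations, outputs)
print("estimated pre-factor:", c_hat)
print("estimated exponent:", b_hat)
```

On data generated this way the estimated exponent should land close to the value used in the simulation; the sketch says nothing about whether a power law is the right model for real cities.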

There is a simple test of the model which has not, so far as
I know, been applied before. If production does scale as some
power of population, $Y \sim cN^{b}$, then per-capita production
$y = Y/N$ should also scale,

$$y \sim cN^{b-1}, \tag{2}$$

and vice versa. As shown below, this transformation drastically
changes the apparent fit of the power-law scaling model.
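
The step from totals to per-capita values can be written out explicitly. Starting from the regression form of Eq. [1] and subtracting $\ln N$ from both sides:

```latex
\begin{align*}
\ln y &= \ln Y - \ln N \\
      &= \ln c + b \ln N + \epsilon - \ln N \\
      &= \ln c + (b - 1) \ln N + \epsilon .
\end{align*}
```

Because $\ln N$ is itself the regressor, the least-squares residuals are the same in both versions: the slope shifts by exactly one and the RMS error on the log scale is unchanged. What changes is the variance being explained, which is that of $\ln y$ rather than of $\ln Y$, so the two fits can have very different $R^2$ values.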

It is worth noting that there is no theoretical reason to
expect a power-law scaling relation of the form of Eq. [1] for
urban economies (while there are such reasons for biological
scaling [1, 2]). Accordingly, and unlike Ref. [3], I also consider
a logarithmic scaling model,

$$y \sim r \ln N/k, \tag{3}$$

a logistic scaling model,

$$\ln y \sim d_1 + d_2 \frac{e^{(N-d_3)/d_4}}{1 + e^{(N-d_3)/d_4}}, \tag{4}$$

and finally a non-parametric scaling relation,

$$\ln y \sim s(\ln N), \tag{5}$$

with $s$ an unknown smooth function, to be determined by the
data. By comparing multiple scaling models, including the
fully data-driven Eq. [5], we can see to what extent the data
actually provide evidence for a particular functional form, or
indeed for any strong scaling relation at all.

Notice that Eq. [2] implies per-capita output grows with
population without limit, and with constant marginal elasticity
(the "15% rule" of [6]); according to Eq. [3] the growth is
unlimited, but slows as population grows; and according to
Eq. [4] per-capita output is asymptotically constant in population.
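
To make the qualitative differences between the three parametric forms concrete, here is a small Python sketch of each as a function of population. It rests on two assumptions: Eq. [3] is read as $y = r \ln(N/k)$, and all parameter values below are hypothetical, chosen only so that the curves are on a plausible dollar scale.

```python
import math


def power_law(n, c, b):
    """Per-capita output under Eq. [2]: y = c * N**(b - 1)."""
    return c * n ** (b - 1.0)


def logarithmic(n, r, k):
    """Per-capita output under Eq. [3], read here as y = r * ln(N / k)."""
    return r * math.log(n / k)


def logistic(n, d1, d2, d3, d4):
    """Per-capita output under Eq. [4]: ln y = d1 + d2 * sigmoid((N - d3) / d4)."""
    z = (n - d3) / d4
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        ez = math.exp(z)
        s = ez / (1.0 + ez)
    return math.exp(d1 + d2 * s)


# Hypothetical parameter values, for illustration only.
for n in (1e5, 1e6, 1e7, 1e8):
    print(n,
          power_law(n, c=6000.0, b=1.12),
          logarithmic(n, r=4000.0, k=50.0),
          logistic(n, d1=10.3, d2=0.4, d3=1e6, d4=5e5))
```

Each doubling of population multiplies the power-law value by the same factor, adds the same absolute increment to the logarithmic value, and eventually leaves the logistic value unchanged.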

Urban Hierarchy. An alternative to size scaling is hierarchical
structure. The "hierarchy of central places", introduced
by Lösch and Christaller in the 1930s, has become a
cornerstone of urban economic geography. In outline, the
idea is that developed economies contain many specialized
goods, and especially services, that the mass of consumers
need only rarely (such as the services of a surgeon), or indirectly
(such as the services of a professor of surgery, or a maker
of surgical instruments). The provision of such services has
comparatively high fixed costs (the time needed to train a 
surgeon) but low marginal costs (the time needed to perform 
an operation), leading to increasing returns to scale. It thus 
becomes economically efficient for these specialists to locate 
in central places, where their fixed costs can be distributed 
over large consumer bases, and the more specialized they are, 
the more centrally located they need to be, and the larger the 
customer base they require. This logic leads to the formation 
of a hierarchy of market centers and cities, in which increas- 
ingly specialized skills, with (as it were) increasingly increas- 
ing returns, can be had, and so predicts positive associations 
between the population of urban centers, the concentration of 
specialist skills within them, and (owing to increasing returns) 
their per-capita economic output. Good reviews of the the- 
ory, including historical citations and connections to modern 
economic models of increasing returns, may be found in Refs. 
0,i. 

Fortunately, the BEA also provides estimates of the shares
of gross metropolitan products attributable to different industrial
sectors, some of which correspond to the specializations
emphasized in central place theory. I specifically consider
"Information, Communication, and Technology (ICT)", "Financial
activities", "Professional and technical services" and
"Management of companies and enterprises" (industry codes
106, 102, 58 and 62, respectively). Writing the proportions
of gross metropolitan product deriving from each of these sectors
as $x_1$ through $x_4$, the level of per-capita production can
be predicted by a log-additive model [15] which incorporates
power-law scaling with city size:

$$\ln y = \ln c + b \ln N + \sum_{j=1}^{4} f_j(x_j) + \epsilon, \tag{6}$$

where each of the "partial response" functions $f_j$ summarizes
the contribution of the $j$th industrial sector. For comparison
with the power-law scaling model (Eq. [1]), I have constrained
the partial response function for size to be logarithmic; the
other partial response functions can be nonlinear, though they
must be smooth.



Statistical Methods 

Power-law scaling relations, like Eq. [1], were estimated through
ordinary least squares, i.e., minimizing
$\sum_{i=1}^{n} (\ln Y_i - \ln c - b \ln N_i)^2$,
where the index $i$ runs over metropolitan areas, of which there
are $n = 366$. As is well known [l^, this is a consistent estimator
of regression parameters for transformed regressions,
even when Gaussian noise assumptions are violated, though
the nominal values of standard errors and confidence intervals
cannot be trusted. The nonlinear but parametric models
(Eqs. [3] and [4]) were fit by nonlinear least squares.



"^To quote Ref. 1121 , MSAs are "standardized county-based areas that have at least one urbanized 
area with a population of 50,000 or more plus adjacent territory that has a high degree of social 
and economic integration with the core, as measured by commuting ties." 

A word on the BEA's procedure is in order [14]. The BEA estimates gross products for each
industry for each state, and conducts surveys to estimate what fraction of each industry's state-wide
earnings is located in each metropolitan area. Multiplying these ratios by the state-wide gross
products, and summing over industries, gives the gross metropolitan product. The BEA provides
no estimates of measurement uncertainty for these numbers.

The BEA withholds the GMP-contribution figures for some industry-MSA combinations, when
the sector is so concentrated in that city that releasing the number would provide consequential
business information about specific firms. I have fit the model discussed below for the 133 cities
with complete data in the four selected sectors. Experimenting with various forms of imputation
for the missing data did not materially change the results.

The analogous procedure for fitting power-law distributions is not reliable, due to differences
between regression and density estimation.
The non-parametric size scaling relation, Eq. [5], was fit by
means of a smoothing spline [17, 18] on the logged data. That
is, the estimated spline is the function $s$ minimizing

$$n^{-1} \sum_{i=1}^{n} \left(\ln y_i - s(\ln N_i)\right)^2 + \lambda \int \left(s''(x)\right)^2 dx \tag{7}$$

with the smoothness penalty $\lambda > 0$ chosen by cross-validation.
Smoothing splines of this type are universal approximating 
functions, and picking the penalty by cross-validation controls 
the risk of over-fitting non-generalizing aspects of the data — 
see Ref. [3 for details^ 
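
The cross-validation idea can be illustrated without a spline library. The sketch below, in Python, estimates out-of-sample RMS error by k-fold cross-validation for two deliberately simple predictors of $\ln y$ from $\ln N$: a constant and a straight line. The helper names (`fit_constant`, `fit_line`, `k_fold_rmse`) and the synthetic data are assumptions for illustration. In the paper, cross-validation was used to choose the spline penalty in R, which this sketch does not reproduce.

```python
import math
import random


def fit_constant(xs, ys):
    """Return a predictor that always predicts the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Return a least-squares straight-line predictor."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: y_bar + slope * (x - x_bar)


def k_fold_rmse(xs, ys, fit, k=6, seed=0):
    """RMS prediction error on held-out folds, pooled over all k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        train_x = [xs[i] for i in idx if i not in held]
        train_y = [ys[i] for i in idx if i not in held]
        predict = fit(train_x, train_y)
        sq_err += sum((ys[i] - predict(xs[i])) ** 2 for i in held_out)
    return math.sqrt(sq_err / len(xs))


# Synthetic stand-in data (hypothetical): ln y rises weakly with ln N.
rng = random.Random(3)
ln_n = [rng.uniform(11.0, 16.0) for _ in range(366)]
ln_y = [10.0 + 0.12 * x + rng.gauss(0.0, 0.23) for x in ln_n]

cv_const = k_fold_rmse(ln_n, ln_y, fit_constant)
cv_line = k_fold_rmse(ln_n, ln_y, fit_line)
print("constant:", cv_const, "line:", cv_line)
```

Any model-fitting function with the same `fit(xs, ys) -> predictor` shape could be passed in. Choosing a smoothing penalty would amount to calling `k_fold_rmse` once per candidate penalty and keeping the smallest score.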

Finally, the additive model (Eq. [6]) was estimated by combining
spline smoothing for the non-parametric partial response
functions $f_j$ and an iterative "back-fitting" procedure
[l5l |. (I used the mgcv library [3.) This adjusts for the 
correlations between industrial sectors, and between city size 
and industrial sectors, so that each estimated partial response 
function is, as far as possible, the unique additive contribution 
of that variable to economic output. 

Results 

Weakness of Scaling in Gross Metropolitan Products. Fitting
Eq. [1] by least squares, I estimate $b$ to be 1.12, in agreement
with Ref. [3], with a 95% bootstrap confidence interval [l^
of (1.10, 1.15). Figure [1] shows the data and the fitted trend,
with both axes plotted on a logarithmic scale, so that a power-law
relationship appears as a straight line. The root-mean-squared
(RMS) error for predicting $\ln Y$ is 0.23, and the "coefficient
of determination" $R^2$ is 0.96, i.e., the fitted values
retain 96% of the variance in the actual data.

Visually, this looks like reasonable data collapse. Plotting
the per-capita values $y$, however, as in Figure [2], reveals
a very different picture, though the two should be logically
equivalent under the power-law model.

Figure [2] shows a trend curve for the power-law scaling
implied by Ref. [3]. (The exponent estimated for $y$ is
0.12, matching that estimated for $Y$, as it must.) The figure
also shows the fitted logarithmic scaling relationship (Eq.
[3]), which is extremely close to the power law over the range
of the data, the logistic scaling relationship (Eq. [4]), and the
non-parametric smoothing spline, corresponding to the relationship
$y \sim e^{s(\ln N)}$. Note that the latter curve is not even
monotonically increasing in $N$.

While the curves in Figure [2] correspond to very different
modeling assumptions (the differences between the implications
of power-law and logistic scaling are perhaps especially
striking), they all account for the data about equally well, or
rather, equally poorly, because most of the variation in per-capita
production is, in fact, unrelated to population. (Note
that the vertical axis is linear, not logarithmic.) The RMS
errors for $\ln y$ of the power law, of logarithmic scaling and of
logistic scaling are, respectively, 0.232, 0.234 and 0.229, while
that of the spline is 0.225. They would predict $y$, for a randomly
chosen city, to within ±26.1, ±26.3, ±25.7 and ±25.3
percent, respectively. Predicting the same value of $y$ for all
cities, however, has an RMS error of 0.27, a margin of ±30%.
Thus the $R^2$ values are, respectively, 0.24, 0.23, 0.26 and 0.29.
On the linear scale, i.e., in terms of dollars per person-year
$y$, the RMS errors of the power law, logarithmic, logistic and
spline curves are, respectively, $7.9 \times 10^3$, $7.9 \times 10^3$, $7.8 \times 10^3$
and $7.7 \times 10^3$, as compared to $9.2 \times 10^3$ for predicting the
mean for all cities. In other words, even allowing for quite
arbitrary functional forms, city size does not predict economic
output very well.
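
The relation between a log-scale RMS error, a percentage margin and an $R^2$ value can be checked with a few lines of Python. The conversion used here, $e^{\mathrm{RMS}} - 1$, is an assumption about how the percentages were obtained, though it appears consistent with the figures quoted. The helper names are hypothetical, and because the inputs are the rounded values from the text, the outputs can only match the quoted numbers approximately.

```python
import math


def log_rms_to_percent(rms_log):
    """Multiplicative margin, in percent, implied by an RMS error on the natural-log scale."""
    return 100.0 * (math.exp(rms_log) - 1.0)


def r_squared(rms_model, rms_baseline):
    """In-sample R^2 from the RMS errors of a model and of the constant (mean-only) predictor."""
    return 1.0 - (rms_model / rms_baseline) ** 2


# Rounded log-scale RMS errors quoted in the text.
rms_by_model = {
    "power law": 0.232,
    "logarithmic": 0.234,
    "logistic": 0.229,
    "spline": 0.225,
}
rms_constant = 0.27  # same value of y predicted for every city

for name, rms in rms_by_model.items():
    print(name,
          round(log_rms_to_percent(rms), 1),
          round(r_squared(rms, rms_constant), 2))
```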

The similarity of the RMS errors, and indeed of the curves,
arises in part from the limited range of $y$. The difference between
the largest and smallest per-capita products ($6.3 \times 10^4$
dollars/person-year) is "only" a factor of 5.2, i.e., not even
one order of magnitude. This is too small, with only 366 observations,
to distinguish among competing functional forms
for the trend, while still being quite consequential in human
and economic terms. Per-capita production is simply not very
strongly related to population.

Taking any per-capita (intensive) quantity which is statistically
independent of population, and looking at the corresponding
aggregate (extensive) quantities will yield a scaling
exponent close to one. The overwhelming majority of the apparent
fit of the scaling relationship in Figure [1] is just such an
artifact of aggregation. This can be shown in three different
ways: by algebra; by extrapolating the different per-capita
functional forms back to city-wide totals; and by simulation.

Algebraically, suppose that $y$ was statistically independent
of $N$. Then $\ln Y = \ln y + \ln N$ would be the sum of two independent
random variables, so its variance would be the sum of
their variances. The $R^2$ of a linear regression of $\ln Y$ on $\ln N$,
with the slope constrained to be 1, would be

$$R^2 = \frac{\mathrm{Var}[\ln N]}{\mathrm{Var}[\ln N] + \mathrm{Var}[\ln y]},$$

which with this data comes to 0.94. That is, even if intensive,
per-capita output was completely independent of city
size, fitting a power-law scaling model to the aggregated data
would capture 94% of the variance in the extensive, total output.
The actual $R^2$, on the other hand, is only 0.96.
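
The algebraic argument is easy to check by simulation. The Python sketch below draws per-capita output independently of population, aggregates, and compares the $R^2$ of the unit-slope fit with the variance ratio above. The Gaussian distributions and their parameters are assumptions chosen so that the variance ratio is near 0.94; they are synthetic stand-ins, not the actual city data.

```python
import random


def variance(values):
    """Population (divide-by-n) variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def r2_unit_slope(ln_n, ln_total):
    """R^2 for predicting ln Y by a + ln N, slope fixed at 1, intercept a fit by least squares."""
    # With the slope fixed at 1, the least-squares intercept is the mean of
    # (ln Y - ln N), so the residual variance is the variance of that difference.
    resid = [t - x for x, t in zip(ln_n, ln_total)]
    return 1.0 - variance(resid) / variance(ln_total)


rng = random.Random(1)
n_cities = 366
ln_n = [rng.gauss(12.7, 1.07) for _ in range(n_cities)]   # log population (hypothetical)
ln_y = [rng.gauss(10.4, 0.27) for _ in range(n_cities)]   # log per-capita output, independent of N
ln_total = [x + y for x, y in zip(ln_n, ln_y)]            # ln Y = ln N + ln y

r2_sim = r2_unit_slope(ln_n, ln_total)
r2_formula = variance(ln_n) / (variance(ln_n) + variance(ln_y))
print("simulated R^2:", r2_sim)
print("variance-ratio formula:", r2_formula)
```

The two numbers differ only through the sample covariance between the independent draws, which shrinks as the number of cities grows.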

Figure [3] shows the same data and scaling curve as Figure
[1], but with three additional trend lines. These are the logarithmic,
logistic and spline fits to the per-capita data (from Figure [2])
extrapolated back to the implied aggregated values $Y$. These
are, visually, almost indistinguishable from the fitted power
law; all have $R^2 = 0.96$.

Figure [4] demonstrates in a different way that the data do 
not support the idea of power-law scaling. The circles in the 
figure show the actual data values. The stars, by contrast, 
are surrogate data simulated from the fitted logistic scaling 
model, with the actual population sizes. The surrogate per-capita
output values $y$ were set equal to the fitted values under
the model of Eq. [4], and then randomly perturbed according
to the empirical distribution of deviations from that model.
The figure plots the surrogate aggregate products $yN$, which
look very much like the data.

If a power-law scaling relation is fit to the surrogate data
from the logistic-scaling regression, then, averaging over many
simulations, the median scaling exponent is 1.12, with 95% of
the estimates falling between 1.10 and 1.15, and the median
$R^2$ of the power law is 0.96. Recall that the estimate for
the actual data was 1.12, with a 95% confidence interval of
(1.10, 1.15), and $R^2 = 0.96$.
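
The surrogate-data procedure can be sketched as follows, in Python. Everything numerical here is hypothetical: the populations are synthetic, the logistic parameters `d1` to `d4` are invented, and Gaussian draws stand in for the empirical residuals of the fitted model. The sketch only shows the mechanics of simulating from a logistic per-capita model, aggregating, and refitting a power law; it makes no attempt to reproduce the exponents reported above.

```python
import math
import random
import statistics


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


def surrogate_exponents(populations, fitted_ln_y, residuals, n_sims, rng):
    """Power-law exponents fit to surrogate totals simulated from a per-capita model."""
    ln_n = [math.log(n) for n in populations]
    exponents = []
    for _ in range(n_sims):
        # Fitted log per-capita value, plus a residual resampled with
        # replacement, plus ln N to aggregate back to the city-wide total.
        ln_total = [f + rng.choice(residuals) + x
                    for f, x in zip(fitted_ln_y, ln_n)]
        exponents.append(ols_slope(ln_n, ln_total))
    return exponents


rng = random.Random(2)
populations = [math.exp(rng.uniform(11.0, 16.0)) for _ in range(366)]
d1, d2, d3, d4 = 10.3, 0.4, 1.0e6, 5.0e5            # hypothetical logistic parameters
fitted_ln_y = [d1 + d2 * sigmoid((n - d3) / d4) for n in populations]
residuals = [rng.gauss(0.0, 0.23) for _ in populations]  # stand-in for empirical residuals

exponents = surrogate_exponents(populations, fitted_ln_y, residuals, 200, rng)
median_b = statistics.median(exponents)
print("median fitted exponent:", median_b)
```

Even though the simulated per-capita output levels off at large sizes, the power law fit to the aggregated surrogate data returns an exponent somewhat above one.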

The RMS error for $\ln y$ on the real data is very slightly
lower for the logistic model (0.229) than for the power law
(0.232). The difference is minute, but is, in fact, statistically
significant: when repeating both fits on surrogate data simulated
from the power law, gaps of this size or larger occur
only about 1% of the time. Not too much should be read into this,
however, owing to the small magnitude of the difference, the 
large errors around both regression curves, and the compara- 
tively small number of observations. Reliably discriminating 
between the two models simply requires more information (in 
the sense of f23|) than the data provides: either much smaller 



A smoothing spline fit to the un-transformed data was similar, but visually somewhat more jagged.

All these measures of error are calculated on the same data used to fit the models, exaggerating
the models' predictive powers. However, using six-fold cross-validation to approximate the out-of-sample
risk gives RMS errors of 0.234 for the power law, 0.236 for logarithmic scaling, 0.232
for logistic scaling, and 0.231 for the spline. The differences are small, but bootstrapping shows
they are significant at the 5% level (at least).

Examples like this are why regression textbooks advise against using $R^2$ to check goodness of fit
[23I23I23. 
fluctuations of $y$ around the regression curve, or many more
data points. 

To sum up these results, the appearance of a strong, super-linear
relationship between gross production $Y$ and population
is mostly driven by production growing in proportion to
population, that is, linearly. Per-capita production $y$ does
not have a strong scaling relationship of any form with $N$,
and the data are unable to distinguish between different functional
forms for such trends as there are. The same is true
of personal income (SI, section 1). Lacking ready access to
the data sets on patents, crime, infrastructure and resource
consumption which Ref. [3] analyzed in the same way as economic
output and personal income, I cannot say whether the
reported scaling relations for those aggregate variables suffer
from the same problem.

Superiority of the Urban-Hierarchy Model. To address the 
question of why there is a weak and noisy tendency for per-capita
output to rise with population, I turn to the log-additive
model, Eq. [6]. Fitting to the data yields the partial
response functions shown in Figure [S]. As expected from
the urban-hierarchy argument, all four of the partial response 
functions are monotonically increasing, so that rising shares 
of those industries predict increasing per capita production. 
Very notably, however, the estimated power-law scaling exponent
is actually negative, -2.6 x 10~^, but statistically indistinguishable
from zero (standard error 2.8 x 10"^). That is,
in the log-additive model, controlling for these four industrial 
sectors makes population effectively irrelevant for predicting 
urban productivity. Indeed, dropping population from the 
model altogether produces no appreciable difference in the 
fit. At least at the level of expectation values, controlling for 
these four industrial sectors "screens off" the effects of city 
size on per-capita production. 

Statistically, there is no question that the log-additive
model predicts better than the simple scaling model. The
RMS error of the former, on the log scale, is 0.218, corresponding
to an $R^2$ of 38.8%, and an accuracy of ±24% or $6.8 \times 10^3$ dollars,
better than any model based on size alone. The log-additive
model is a more flexible specification, and so over-fitting to the
data is an issue, but this can be addressed by cross-validation,
which directly measures the ability of a model to extrapolate
from one part of the data to another [13]. The cross-validated
mean squared error of the log-additive model is 0.053, while
that of the pure power law is 0.067, clearly showing that the
extra complexity of the former is being used to capture genuinely
predictive patterns, and not merely to memorize the
training data.

The simple log-additive model is unlikely to be a fully ad- 
equate predictor of systematic differences in urban productiv- 
ity. If nothing else, these four coarse-grained industrial sectors 
were selected merely for convenience, as approximate indica- 
tors of position in the urban hierarchy, and presumably could 
be improved. Moreover, the model does not even try to rep- 
resent the interactive processes which lead cities to have the 
industrial mixes that they do. In reality, these industries can 
be so concentrated towards the largest cities, at the top of the 
hierarchy (e.g., New York), and away from lower-rank cities
(e.g., San Francisco, Peoria), only because all these cities are 
part of a single national, and even international, division of 
labor d. 

"The Pace of Life" . A further claim of Ref. [3] is that the 
speed at which people walk grows as a positive power of the 
number of people in a city. The source given for this is Ref.
[10], a two-page letter to Nature in 1976. The authors of Ref.
[10] went to 15 cities, towns and villages, picked locations
and individuals which seemed to them to be comparable, and
timed how long it took them to walk fifty feet (15.2 meters).
Such unsystematic data, however intriguing, is too weak to
support substantial scientific conclusions. Nonetheless, it is
instructive to examine it, as in Figure [5].

The original plot (Figure 1 in Ref. [10]) showed population
on a log scale, and speed on a linear scale, as in Figure [5].
The linear regression, for this transformation of the data, corresponds
to assuming that speed grows logarithmically with
population, $v \sim r \ln N/k$. Figure 2a in Ref. [3] re-plots the
same data, but with the vertical axis on a logarithmic scale,
so the linear regression assumes speed grows as a power of
population, $v \sim cN^{b}$. (Neither figure included error bars,
though Bornstein and Bornstein give the standard deviations
in their caption.) As can be seen from Figure [5], both of these
regressions, along with logistic scaling, are very similar in this
data, while they embody very different assumptions, and at
most one can be right.

The explanation for this apparent paradox is that the
range of reported walking speeds is small, from 0.7 m/s to
1.8 m/s, and if $|x| \ll 1$, then $\ln(1 + x) \approx x$. Observed over a
narrow range, then, logarithmic and power law scaling simply
are very similar, and hard to distinguish. This is also why the
power-law and logarithmic fits to per-capita production
in Figure [2] were so close.
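
The approximation can be spelled out in one short derivation. The reference speed $v_0$ (any value in the middle of the observed range) and the relative deviation $x = (v - v_0)/v_0$ are introduced here only for illustration. The logarithmic form is read as $v = r \ln(N/k)$.

```latex
\begin{align*}
\ln v &= \ln c + b \ln N
      && \text{(power law, } v = cN^{b}\text{)} \\
\ln v &= \ln v_0 + \ln(1 + x) \approx \ln v_0 + \frac{v - v_0}{v_0}
      && \text{(first order in } x\text{)} \\
v &\approx v_0 \left( 1 + \ln\frac{c}{v_0} + b \ln N \right)
      && \text{(equating the two)} \\
  &= r \ln\frac{N}{k},
      \quad r = v_0 b, \quad \ln k = -\frac{1 + \ln(c/v_0)}{b} .
\end{align*}
```

To first order in $x$, then, a power law in $N$ is itself a logarithmic law. The neglected term is of order $x^2/2$, so the two forms separate only when speeds vary by a large factor.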

Discussion 

Neither gross metropolitan product nor personal income scales 
with population size for U.S. metropolitan areas. The appear- 
ance of scaling in Refs. [3|, is an artifact of inappropriately 
looking at extensive variables (city-wide totals) rather than 
intensive ones (per-capita values). Scaling is also unpersua- 
sive for walking speed. I was not able to examine the other 
variables claimed to show scaling in Refs. 0,13, but, as they 
were all extensive variables, the analyses reported there would 
be subject to the same aggregation artifacts. It is also possible 
that cities in the contemporary United States are anomalous, 
and that scaling of income and economic output holds else- 
where. 

It is evident from Figure [2] (and Supplemental Figure S1)
that there is a weak tendency for per-capita output and income
to rise with population, though the relationship is simply
too loose to qualify as a scaling law. (Arguably, the real
trend in those figures is for the minimum per-capita output to
rise with population, though I would not press this point.) Qualitatively,
this is what one would expect from well-established
findings of economic geography. The data do not really
support any stronger quantitative statement. In particular,
asserting any specific functional form, such as a power law,
goes far beyond what the data can support. Nor is there
any theory, supported on independent grounds, which predicts
a specific functional form. Accordingly, extrapolations based
on such claims (e.g., the finite-time singularity in the model
for city growth in Q]) are speculative at best. The amplitude
of fluctuations around the trend lines is, in any case, so large
that predictions based on size alone can have very little utility.

By taking account of the shares of just a few industries in
the gross metropolitan product, we can obtain much better
predictions of the level of per-capita production. In this statistical
model, summarized in Eq. [6], population plays no significant
direct role in predicting per capita economic output,
and could in fact be profitably ignored. Rather, the industrial
sectors used are chosen as signs of where metropolitan areas



Dropping population size N from the log-additive model altogether actually improves the cross- 
validation score, very slightly, to 0.052. 
stand in the urban hierarchy, which is also related, of course,
to size. One could interpret this as the mechanism by which 
size scaling happens (to the extent that it does), but this 
would imply that an exogenous increase in a city's population 
would automatically shift its industrial pattern, which is im- 
plausible. Indeed, the whole scaling picture for cities seems to 
rest on an oddly monadic, interaction-free view of metropoli- 
tan areas. The logic of central place theory, in contrast, relies 
on cities being part of an interactive assemblage, coupled by 
processes of production, distribution and exchange. This not 
only seems more plausible, but also better matches the evi- 
dence at hand. 

As Refs. 0, m, 01 have stressed, developing a sound sci- 
entific understanding of cities should be a priority for an in- 
creasingly urban species. In seeking such understanding, it is a 
sound strategy to begin with simple hypotheses, and to reject 
them in favor of more complicated ones only as they prove un- 
able to explain the data. This is not because the truth is more 
likely to be simple, in some metaphysical sense, but because 
this strategy leads us to the truth faster and more reliably 
than ones which invoke needless complexities [23. The ele- 
gant hypothesis of power-law scaling marked a step forward 
in our understanding of cities, but it is now time to leave it 
behind. 

Appendix 



Personal Income 

The BEA also makes available estimates of personal income
by metropolitan area, a variable closely related to, but not
quite the same as, the gross metropolitan product. (See
http://www.bea.gov/regional/reis/ for definitions, estimation
techniques, and data.) Ref. [3] reports that total personal
income $L$ also scales as a power of population, implying per
capita personal income $l = L/N$ should scale likewise. Figure
[7] plots $l$ versus $N$, with the best-fitting power law, logarithmic
relationship, and spline.

Once again, the appearance of power-law scaling in the 
aggregate variable is not supported by examination of the per- 
capita values. The RMS error, on the log scale, of predicting 
a constant per capita income over all cities is 0.179, while the 
RMS errors of the power-law, logarithmic and logistic scal- 
ing relations are 0.157, 0.158 and 0.156, and that of the spline 
0.154. Even the spline thus has an $R^2$ of only 0.26. Repeating
the procedures of Figures 3 and 4 from the main text yields 
similar results. Thus, personal income also fails to display 
non-trivial scaling with population. 
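
The figure of 0.26 quoted for the spline can be recovered from the RMS errors alone, assuming it is the usual R^2 measured against the constant-per-capita-income baseline, i.e., 1 - (RMS_model / RMS_constant)^2. The short Python sketch below applies that formula to the values quoted above; the variable and function names are illustrative and not taken from the original analysis.

```python
# RMS errors on the log scale, as quoted in the text.
rms_constant = 0.179  # baseline: constant per capita income for all cities
rms_models = {
    "power law": 0.157,
    "logarithmic": 0.158,
    "logistic": 0.156,
    "spline": 0.154,
}

def r_squared(rms_model, rms_baseline):
    """Fraction of the baseline mean squared error removed by the model."""
    return 1.0 - (rms_model / rms_baseline) ** 2

r2 = {name: r_squared(rms, rms_constant) for name, rms in rms_models.items()}
for name, value in r2.items():
    print(f"{name}: R^2 = {value:.2f}")
```

Under this reading, the three parametric curves come out between 0.22 and 0.24, so the extra flexibility of the spline buys very little.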



Mixtures of Scaling Relations 

Recall that the posited scaling relation is y ~ cN^b. As shown
above, this does not fit the data, at least not assuming, following
Ref. [3], that both parameters, the scaling exponent
b and the pre-factor c, are the same for all cities. A natural
way to try to reconcile the data with the model would be
to modify the latter, allowing c to depend on the type of the 
city. The rationale for such a regression would be that there 
are several different kinds of cities, and that city type shifts 
the over-all level of production up or down, but, once that is 
factored out, all cities scale with size in the same way. This 
common scaling exponent would not, naturally, be the same 
as the one estimated from the pooled data. 

Formally, we introduce a latent variable Z for each city,
treated as a discrete random variable independent of N, and
consider the statistical model y ~ c_Z N^b. This leads to a
"mixture-of-regressions" or "latent-class regression" model,
which can be fit by the expectation-maximization algorithm
[25]. Such fitting would lead not only to estimates of b and
the pre-factors c_Z, but also to the probability that each city
belonged to each of the different city types or mixture components,
categorizing cities inductively from the data.

To investigate this, I fit mixture-of-regression models to 
the data from Figure 2 in the main text, varying the number 
of mixture components from 1 to 10, using the software of Ref. 
[2^13 To determine the correct number of mixture compo- 
nents, I used both Schwarz's "Bayesian" information criterion 
and cross-validation, which are both known to be consistent 
for such mixture problems, unlike the Akaike information criterion,
which over-fits [26]. Both BIC and cross-validation
strongly favored one mixture component, meaning that the 
fit to the data is not actually improved by allowing for multi- 
ple scaling curves. 
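
To make the procedure concrete, here is a minimal from-scratch sketch in Python of fitting a mixture of regressions by expectation-maximization and scoring each number of components with Schwarz's criterion. It is not the software used for the results above. It works on the log scale (log y = log c_Z + b_Z log N + noise), assumes Gaussian noise, lets the exponent vary by component, uses synthetic data generated from a single scaling curve, and omits the cross-validation step. All names and numerical settings are hypothetical.

```python
import numpy as np

def fit_mixture(x, y, k, n_iter=200, seed=0):
    """EM for a k-component mixture of lines y = a_z + b_z * x + Gaussian noise."""
    rng = np.random.default_rng(seed)
    n = len(x)
    X = np.column_stack([np.ones(n), x])          # columns: intercept, slope
    resp = rng.random((n, k))                     # random initial soft assignments
    resp /= resp.sum(axis=1, keepdims=True)
    trace = []
    for _ in range(n_iter):
        # M-step: mixing weights, weighted least squares, weighted noise scale.
        weights = np.clip(resp.mean(axis=0), 1e-12, None)
        coefs = np.empty((k, 2))
        sigmas = np.empty(k)
        for j in range(k):
            w = resp[:, j]
            XtW = X.T * w
            # Tiny ridge and floors are numerical guards, not part of the model.
            coefs[j] = np.linalg.solve(XtW @ X + 1e-9 * np.eye(2), XtW @ y)
            resid = y - X @ coefs[j]
            sigmas[j] = max(np.sqrt(np.sum(w * resid**2) / (np.sum(w) + 1e-12)), 1e-3)
        # E-step: posterior probability of each component for each city.
        mu = X @ coefs.T
        log_dens = (np.log(weights) - np.log(sigmas) - 0.5 * np.log(2 * np.pi)
                    - 0.5 * ((y[:, None] - mu) / sigmas) ** 2)
        top = log_dens.max(axis=1, keepdims=True)
        log_norm = top + np.log(np.exp(log_dens - top).sum(axis=1, keepdims=True))
        resp = np.exp(log_dens - log_norm)
        trace.append(float(log_norm.sum()))       # log-likelihood at current parameters
    return coefs, resp, np.array(trace)

def bic(loglik, k, n):
    """Schwarz criterion (lower is better): 3 parameters per line plus k-1 weights."""
    n_params = 3 * k + (k - 1)
    return -2.0 * loglik + n_params * np.log(n)

# Synthetic stand-in for 366 cities, all generated from one scaling curve.
rng = np.random.default_rng(1)
log_n = rng.uniform(11.0, 16.0, size=366)
log_y = 9.0 + 0.12 * log_n + rng.normal(0.0, 0.2, size=366)

fits = {k: fit_mixture(log_n, log_y, k) for k in (1, 2, 3)}
bics = {k: bic(fits[k][2][-1], k, len(log_n)) for k in fits}
print(bics)
```

On data generated from a single curve, one would expect the one-component fit to have the lowest BIC, since extra components add parameters without a matching gain in likelihood. The rows of the returned responsibility matrix are the per-city membership probabilities mentioned earlier.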

This does not completely rule out the y ~ c_Z N^b model,
as only 366 observations may not have enough information to
simultaneously induce appropriate categories and fit scaling
relations. An alternative would be to expand the information
available, by defining the categorical variable Z in terms of
measurable attributes of cities other than N and y, such as
geographic location or the mix of industries. (See Ref. [27] on
such variable-intercept, constant-slope regressions with known 
categories.) Success with such models hinges on selecting cat- 
egories to represent important features of the data-generating 
process, a task I must leave to other inquirers. 
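
For readers who wish to try this, a variable-intercept, constant-slope regression with known categories reduces to ordinary least squares on the log scale with one indicator column per category. The sketch below is only an illustration on synthetic data: the three city types, their intercepts, and the exponent of 0.10 are invented for the example and carry no empirical meaning.

```python
import numpy as np

def fit_common_slope(log_n, log_y, category):
    """OLS of log y on log N with one intercept per category and a shared slope."""
    labels, codes = np.unique(category, return_inverse=True)
    dummies = np.eye(len(labels))[codes]          # one indicator column per category
    design = np.column_stack([dummies, log_n])    # no global intercept, so no collinearity
    coef, *_ = np.linalg.lstsq(design, log_y, rcond=None)
    intercepts = dict(zip(labels.tolist(), coef[:-1]))
    return intercepts, coef[-1]

# Synthetic illustration: three hypothetical city types sharing one exponent.
rng = np.random.default_rng(0)
n_cities = 300
city_type = rng.choice(["coastal", "inland", "resource"], size=n_cities)
shift = {"coastal": 10.2, "inland": 10.0, "resource": 9.8}   # invented log pre-factors
log_n = rng.uniform(11.0, 16.0, size=n_cities)
log_c = np.array([shift[t] for t in city_type])
log_y = log_c + 0.10 * log_n + rng.normal(0.0, 0.1, size=n_cities)

intercepts, slope = fit_common_slope(log_n, log_y, city_type)
print(intercepts, slope)
```

The fitted intercepts estimate log c_Z for each category and the last coefficient estimates the common exponent b.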

Assuming that such a statistical model works, there would 
still be the question of its interpretation. Whether one would 
judge such a model to really show scaling in urban assemblages 
would depend on how much importance one gives, on the one 
hand, to a common scaling exponent, and on the other to 
most of the fit coming from the un-modeled differences across 
city types. 



Residuals 

Ref. [i| proposes ranking cities not by their per capita values 
of quantities like economic production or patents or crime, but 
by the deviation, positive or negative, from the scaling relationship,
i.e., by the residuals of the trend lines. (It does not
compare this to ranking by per capita values. The Spearman
rank correlation between the two rankings is 0.87 for GMP
and 0.83 for personal income.) They consider both a Gaussian
distribution for the residuals, i.e., a probability density
f(x) ∝ e^{-x^2}, and a Laplace distribution, f(x) ∝ e^{-|x|},
and claim that both fit very well.
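
The comparison of the two rankings can be sketched as follows, assuming SciPy is available. The data here are synthetic (the exponent 1.12 is borrowed from Figure 1, everything else is invented), so the printed correlation illustrates the calculation and is not expected to reproduce the 0.87 or 0.83 reported above.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
log_n = rng.uniform(11.0, 16.0, size=366)                          # hypothetical log populations
log_total = 9.0 + 1.12 * log_n + rng.normal(0.0, 0.25, size=366)   # hypothetical log totals

# Ranking 1: per capita value, since log(Y/N) = log Y - log N.
log_per_capita = log_total - log_n

# Ranking 2: residual from the fitted power law, i.e., OLS of log Y on log N.
slope, intercept = np.polyfit(log_n, log_total, deg=1)
residual = log_total - (intercept + slope * log_n)

rho, _ = spearmanr(log_per_capita, residual)
print(f"fitted exponent {slope:.2f}, Spearman rank correlation {rho:.2f}")
```

Because the per capita value equals the residual plus a slowly varying function of population, the two rankings agree closely whenever the fitted exponent is near one.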

Figure 8 shows the situation for GMP. Visually, neither
distribution matches the residuals well. Quantitatively,
goodness-of-fit can be checked by "data-driven smooth tests"
[28], which transform their inputs so that they will be uniform
if and only if the postulated distribution holds, and then measure
departures from uniformity (coefficients from expanding
the transformed empirical distribution in a series of orthogonal
polynomials). Such tests reject both the Gaussian and the
Laplace distribution with high confidence (p- values of 1 x 10 ~^ 
and 8 x 10~^, respectively, calculated using code provided by 
Ref. [29]).
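
The logic of these checks can be illustrated with a simplified stand-in. The Python sketch below fits both distributions by maximum likelihood, applies the probability integral transform, and measures departure from uniformity with the Kolmogorov-Smirnov distance. This is a swapped-in technique and not the data-driven smooth test itself. Only the statistic is reported, because standard Kolmogorov-Smirnov p-values are not valid when the parameters have been estimated from the same data. The residuals are synthetic, drawn from a Laplace distribution with an arbitrary scale.

```python
import numpy as np
from scipy import stats

def fit_gaussian(x):
    """Maximum likelihood Gaussian: sample mean and (biased) standard deviation."""
    return np.mean(x), np.std(x)

def fit_laplace(x):
    """Maximum likelihood Laplace: median and mean absolute deviation from it."""
    loc = np.median(x)
    return loc, np.mean(np.abs(x - loc))

# Synthetic residuals, drawn from a Laplace distribution purely for illustration.
rng = np.random.default_rng(0)
resid = rng.laplace(loc=0.0, scale=0.2, size=2000)

g_loc, g_scale = fit_gaussian(resid)
l_loc, l_scale = fit_laplace(resid)

# Log-likelihoods at the fitted parameters.
ll_gauss = stats.norm.logpdf(resid, g_loc, g_scale).sum()
ll_laplace = stats.laplace.logpdf(resid, l_loc, l_scale).sum()

# Probability integral transform: uniform on [0, 1] if the fitted model holds.
u_gauss = stats.norm.cdf(resid, g_loc, g_scale)
u_laplace = stats.laplace.cdf(resid, l_loc, l_scale)

# Kolmogorov-Smirnov distance from uniformity (a stand-in for the smooth test).
ks_gauss = stats.kstest(u_gauss, "uniform").statistic
ks_laplace = stats.kstest(u_laplace, "uniform").statistic
print(ll_gauss, ll_laplace, ks_gauss, ks_laplace)
```

With real residuals, the array `resid` would be replaced by the deviations from the fitted scaling relation, and a properly calibrated test would be needed before quoting p-values.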



For computational reasons, it is easier to fit the more general specification in which the scaling
exponent is also allowed to vary, y ~ c_Z N^{b_Z}. (Sharing a parameter across the regressions
complicates the maximization step of the expectation-maximization algorithm.) If the constant-exponent
model is right, the estimated exponents for each mixture component should agree to
within statistical precision.
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Results for personal income are similar (Figure 9). The
Gaussian distribution can be rejected with high confidence 
{p < fO~*). While the data do not rule out the Laplace dis- 
tribution in the same way (p = 0.27), the limited power of the 
test at the comparatively small sample size means that there 
is not strong evidence in its favor either. (See Ref. [131 on the 
evidential interpretation of significance tests.) 
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Fig. 1. Horizontal axis: population of the 366 US metropolitan statistical areas in 2006, log 
scale; vertical axis, 2006 gross product of each MSA, in constant 2001 dollars, log scale. (In 
all figures, grey inner ticks on axes mark observed values.) Solid line: ordinary least squares 
regression of log gross metropolitan product on log population, i.e., the regression Y ~ cN^b,
with estimated exponent b = 1.12. 
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Fig. 2. Horizontal axis: population, as in Figure 1, log scale. Vertical axis: gross product
per capita, but on a linear and not a logarithmic scale. The two largest values are 7.8 × 10^4
dollars/person-year (in Bridgeport-Stamford-Norwalk, CT, a center for hedge funds and other
financial firms) and 7.7 × 10^4 dollars/person-year (in San Jose-Sunnyvale-Santa Clara, CA,
i.e., Silicon Valley), and the smallest are 1.5 × 10^4 dollars/person-year (in McAllen-Edinburg-
Mission, TX and Palm Coast, FL). Black line: fitted power-law scaling relation. Blue line: fitted
logarithmic scaling relationship. Grey line: logistic scaling. Red line: smoothing spline fitted to 
the logged data. 
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Fig. 3. As in Figure 1, but with the addition of curves showing the scaling relations from
Figure 2, extrapolated to aggregate rather than per-capita values. These are visually all but
indistinguishable. 




Fig. 4. Axes: as in Figures 1 and 3. Circles: actual values. Stars: simulated values, with
per-capita production figures drawn from the logistic (not power-law) scaling model. 
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Fig. 5. Horizontal axis: city population, logarithmic scale. Vertical axis: estimated pedes- 
trian speed in meters/second, plus or minus one standard deviation, linear scale. Blue line: the 
regression V ~ r ln N/k, as proposed by Ref. [10]. Black line: the regression V ~ cN^b, as
proposed by Ref. Grey line: logistic scaling. (Data from Ref. [l^, who report the mean 
and standard deviation of the time taken to walk 50 feet = 15.2 meters; I calculated standard 
deviations by propagation of error.) 
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Fig. 6. Partial response functions for the log-additive model (Eq. 6). Horizontal axes indicate the fraction of each metropolitan area's gross product derived from each
industry, while the vertical axis shows the predicted logarithmic increase, or decrease, to per capita output, relative to the baseline of the mean over all cities. Solid curves are
the main estimate, with dashed curves at ±2 standard errors in the partial response function. Dots show "partial residuals", the difference between actual ln y values and
those predicted by the model including all the other variables. 
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Fig. 7. Personal income per capita versus population, 2006. Horizontal axis: population of 
MSAs (log scale). Vertical axis: personal income per capita, in nominal 2006 dollars (linear 
scale). Black line: power-law scaling curve (estimated exponent 0.082). Blue line: logarithmic 
scaling curve. Grey line: logistic scaling curve. Red line: spline fit to logged data. 
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Fig. 8. Horizontal axis: magnitude of residuals from power-law scaling of gross metropolitan 
product on population, i.e., from regressing ln Y on ln N. Vertical axis: probability density
of the residual distribution. Solid line: nonparametric kernel density estimate (Gaussian kernel,
default bandwidth choice; see Ref. [31]). Dashed line: maximum likelihood Gaussian fit to
residuals. Dotted line: maximum likelihood Laplace (double-exponential) fit to residuals. 
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Fig. 9. As in Figure 8, but showing the deviations of personal income from power-law scaling.
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